Improve emulator re-initialisation #872

radka-j · 2025-10-06T09:42:52Z

Closes #748
Closes #874
Closes #878
Closes #757

This PR:

adds a fit_from_reinitialised method that is used in both AutoEmulate.compare and HMW.refit_emulator
makes emulators save all their input args so that all input values can be retrieved
replaces any **kwargs in emulators with an optional scheduler_kwargs keyword argument to match actual usage
updates HMW so that the user can pass an emulator as well as a result
updates Emulator.fit to handle InputLike instead of expecting only TensorLike
updates AL to accept emulator predictions that are DistributionLike rather than GaussianLike, matching the TransformedEmulator prediction types
updates AL to use fit_from_reinitialized
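
To illustrate the first two points, here is a minimal sketch of how an emulator might record its constructor arguments so that a fresh, unfitted copy can be built and fitted. `SketchEmulator`, `get_init_params` and the body of `fit_from_reinitialized` shown here are hypothetical stand-ins, not AutoEmulate's actual implementation; only the idea of saving input args and passing `scheduler_kwargs` as an explicit keyword comes from the description above.

```python
from __future__ import annotations

import copy
import inspect
from typing import Any


class SketchEmulator:
    """Hypothetical emulator; NOT AutoEmulate's real Emulator class."""

    def __init__(self, lr: float = 0.1, scheduler_kwargs: dict[str, Any] | None = None):
        # Every constructor argument is stored under its own name, so it can
        # be read back later. An explicit scheduler_kwargs replaces **kwargs.
        self.lr = lr
        self.scheduler_kwargs = dict(scheduler_kwargs or {})
        self.n_seen = 0  # fitted state, deliberately not a constructor arg

    def get_init_params(self) -> dict[str, Any]:
        # Read the constructor signature and collect the saved values.
        sig = inspect.signature(type(self).__init__)
        names = [name for name in sig.parameters if name != "self"]
        return {name: getattr(self, name) for name in names}

    def fit(self, x, y) -> "SketchEmulator":
        # Placeholder for real training: just record how much data was seen.
        self.n_seen = len(x)
        return self


def fit_from_reinitialized(emulator, x, y):
    """Build a fresh instance with the same init args, then fit it."""
    params = copy.deepcopy(emulator.get_init_params())
    fresh = type(emulator)(**params)
    return fresh.fit(x, y)


original = SketchEmulator(lr=0.5, scheduler_kwargs={"gamma": 0.9})
original.fit([1, 2], [3, 4])
refit = fit_from_reinitialized(original, [1, 2, 3], [4, 5, 6])
```

Because the fresh instance is built from `type(emulator)`, this pattern does not need to look the class up by name.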

sgreenbury · 2025-10-06T13:26:45Z

Just adding a note here as I ran into this when working with a GP subclass for the error quantification. This call:

autoemulate/autoemulate/core/compare.py

Line 578 in 76689ae

model_class = get_emulator_class(result.model_name)

fails since:

autoemulate/autoemulate/emulators/__init__.py

Lines 68 to 70 in 76689ae

    
    emulator_cls = EMULATOR_REGISTRY.get(
        name.lower()
    ) or EMULATOR_REGISTRY_SHORT_NAME.get(name.lower())

doesn't also look at:

autoemulate/autoemulate/emulators/gaussian_process/exact.py

Lines 460 to 463 in 76689ae

    
    GP_REGISTRY = {
        "GaussianProcess": GaussianProcess,
        "GaussianProcessCorrelated": GaussianProcessCorrelated,
    }

@radka-j - adding this here as it might be addressed by the upcoming changes to this API. If not, I'm happy to open a new issue to look at this. Another option would be to revisit having a central registry class to handle this uniformly.
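
As a rough illustration of the central-registry option mentioned above, the sketch below keeps a single case-insensitive lookup table and registers classes that are created at runtime. `EmulatorRegistry`, `REGISTRY`, `create_subclass` and the `...Sketch` class names are all hypothetical; this is one possible shape for such a registry, not the project's API.

```python
class EmulatorRegistry:
    """Hypothetical central registry; name and methods are assumptions."""

    def __init__(self):
        self._by_name = {}

    def register(self, cls, short_name=None):
        # Store under the lower-cased class name, plus an optional short name.
        self._by_name[cls.__name__.lower()] = cls
        if short_name is not None:
            self._by_name[short_name.lower()] = cls
        return cls  # returning cls lets register() double as a decorator

    def get(self, name):
        try:
            return self._by_name[name.lower()]
        except KeyError:
            raise ValueError(f"Unknown emulator: {name!r}") from None


REGISTRY = EmulatorRegistry()


@REGISTRY.register
class GaussianProcessSketch:
    """Stand-in for a built-in emulator class."""


def create_subclass(name, base):
    # Build a subclass at runtime and register it in the same table,
    # so a later lookup by name finds it too.
    return REGISTRY.register(type(name, (base,), {}))


GaussianProcessRBFSketch = create_subclass("GaussianProcessRBFSketch", GaussianProcessSketch)
```

With a single table, a lookup such as `REGISTRY.get("GaussianProcessRBFSketch")` resolves runtime-created subclasses the same way as built-in ones.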

radka-j · 2025-10-06T13:38:25Z

@sgreenbury I don't think we should ever use the GaussianProcess or GaussianProcessCorrelated classes, so this feels like correct behaviour to me. If we want a GP class for an RBF + constant kernel, we should add that specifically to the registry.

sgreenbury · 2025-10-06T14:06:04Z

It was in the GP context (passing a create_gp_subclass instance to AutoEmulate) that I ran into this issue, and a workaround might have been to also look at GP_REGISTRY, since this maintains a registry of all GPs including the created subclasses.

But thinking more about it, this currently affects any subclass used by AutoEmulate if reinitialize is called, e.g. in the advanced tutorial:

class SimpleFNN(PyTorchBackend):
    ...
ae = AutoEmulate(x, y, models=[SimpleFNN])
ae.fit_from_reinitialized(x, y)

Since SimpleFNN is constructed at runtime, the class is not found in the lists of emulators.

I think if the emulator becomes the entity that does the refitting in this PR, then a global emulator registry including all custom subclasses would not be needed for this, but it might still be useful?


radka-j · 2025-10-13T16:19:09Z

The logdet, trace and max_eigval plots in the AL documentation look wrong after the refactor here (they barely change). I started trying to figure out what's happening and have a sense that the predicted uncertainty is narrower when using a GP wrapped inside a TransformedEmulator (even without any transforms) than when using just a GP. I need to investigate this more formally, but we need to understand what's happening before we can merge this.

radka-j · 2025-10-14T09:50:21Z

I don't know what the issue is yet but my previous comment about the uncertainty from TransformedEmulator being narrower was wrong. I was comparing GP vs TransformedEmulator with GP using different learning rates. Once the same learning rate was used they look visually identical.


sgreenbury · 2025-10-14T10:20:52Z

It might be related to whether posterior_predictive=True is being passed to the reinitialized GP when within the TransformedEmulator?

For example, on main in the dim reduction tutorial:
https://github.com/alan-turing-institute/autoemulate/blob/6d4a92fdcb2614b5dee5f907855e7003503c0910/docs/tutorials/emulation/02_dim_reduction.ipynb

em = ae.fit_from_reinitialized(x[train_idx], y[train_idx])

has:

print(em.model.posterior_predictive)
False

despite the original AutoEmulate initialization having posterior_predictive=True.

radka-j · 2025-10-14T10:33:13Z

Thank you for checking! In this case the posterior_predictive is correctly set to True after the emulator is re-initialized each time.

radka-j · 2025-10-14T10:34:18Z

@sgreenbury I'm also not sure if you saw my previous comment but the uncertainty output from TransformedEmulator seems to be fine.

radka-j · 2025-10-14T13:38:06Z

@sgreenbury I tried running the AL notebook using a GP wrapped inside a TransformedEmulator but calling emulator.fit instead of fit_from_reinitialized, as originally implemented, and the results look very similar to the current docs. So it looks like the issue comes from re-initializing the emulator. Given the GP is refitting 1 data point at a time, this might be a case where calling fit with the hyperparameters fixed actually makes sense.

I therefore decided to revert this change and leave AL as is in this PR (only updating typing). We can separately decide whether to leave the associated issue (#757) open to revisit at some later point or close.
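
The difference between the two refitting modes can be sketched with a toy model. The example below is only an illustration of warm-starting versus re-initialising under a fixed training budget; `ToyEmulator` and `refit` are hypothetical, and a one-parameter gradient-descent model does not reproduce the GP behaviour seen in the AL notebook.

```python
class ToyEmulator:
    """Hypothetical one-parameter model y = w * x trained by gradient descent."""

    def __init__(self, lr=0.1, epochs=5):
        self.lr = lr
        self.epochs = epochs
        self.w = 0.0  # learned parameter, starts at zero

    def fit(self, xs, ys):
        # Training continues from the current value of self.w (warm start).
        for _ in range(self.epochs):
            grad = sum(2 * (self.w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
            self.w -= self.lr * grad
        return self


def refit(emulator, xs, ys, reinitialize=True):
    """Refit either a fresh copy (reinitialize=True) or the existing model."""
    if reinitialize:
        # Discard the learned parameter, keep only the constructor settings.
        emulator = ToyEmulator(lr=emulator.lr, epochs=emulator.epochs)
    return emulator.fit(xs, ys)


xs, ys = [1.0, 2.0], [2.0, 4.0]  # generated with w = 2
fitted = ToyEmulator().fit(xs, ys)
cold = refit(fitted, xs, ys, reinitialize=True)   # starts again from w = 0
warm = refit(fitted, xs, ys, reinitialize=False)  # continues from the fitted w
```

With the same number of epochs per call, the warm-started model ends closer to the true parameter than the re-initialised one, which is one reason a learner that refits after every new data point might expose this choice as an option.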


sgreenbury

Looks great, thanks @radka-j! As we discussed:

it looks like the choice of whether to reinitialize the emulator would be good to have in the API
there's an issue following our discussion capturing revisiting the overall workflow (#893)
the dimensionality reduction tutorial seems to not pick up the model_params={"posterior_predictive": True}

There is the comment above about DistributionLike not always having mean/variance - I don't think we'll run into this currently but might be good to either restrict here with the instance matching or have an issue for it.

Otherwise looks good to merge!
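
One way to guard against a distribution without a usable variance is sketched below. The helper `require_variance` and the two stand-in classes are hypothetical and do not reflect the check that was eventually added to the learners; the sketch only shows the general pattern of failing early with a clear error.

```python
class GaussianSketch:
    """Stand-in for a distribution that exposes mean and variance."""

    def __init__(self, mean, variance):
        self.mean = mean
        self.variance = variance


class NoMomentsSketch:
    """Stand-in for a distribution whose variance is not implemented."""

    @property
    def variance(self):
        raise NotImplementedError


def require_variance(dist):
    # Accessing the property may fail in two ways: the attribute is missing,
    # or it exists but raises NotImplementedError. Both become one clear error.
    try:
        return dist.variance
    except (AttributeError, NotImplementedError) as err:
        raise ValueError(
            f"{type(dist).__name__} does not provide a variance"
        ) from err


ok_variance = require_variance(GaussianSketch(mean=0.0, variance=1.5))
```

An alternative, as suggested in the review, would be to restrict the accepted types with an instance check instead of probing the property.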

radka-j added 4 commits October 6, 2025 10:04

rm kwargs from RF

2ca9a94

save mlp kwargs in ensembles

3ce9677

save all MLP and GP input params

cb44c83

update HMW to work with Emulator as well Result object, rename result input arg to emulator

5346644

radka-j added 2 commits October 6, 2025 13:46

add option to pass transformed_emulator_params

23596b1

Merge branch 'iss867/update_gp_factory' into reinitialise

94b7029

Base automatically changed from iss867/update_gp_factory to main October 6, 2025 13:19

radka-j added 2 commits October 6, 2025 14:31

make mlp_kwargs a keyword argument in MLP ensembles

d9503f1

make scheduler_kwargs a keyword argument

4e6e742

radka-j added 17 commits October 6, 2025 16:17

add scheduler_cls input keyword arg

c08f610

check x/y standardization from emulator object

e0ea75b

update correlated GP

dacefba

Merge branch 'main' into reinitialise

5e25da2

fix scheduler kwarg passing to scheduler_setup

8439e4a

update scheduler_setup method

38283a8

fix test

7a925b2

update scheduler tests

3c523a6

add reinitialize method

f6a2377

add reinitialize method

5b0d9f2

add tensor conversion and device handling to TransformedEmulator

4102bbd

fix var order

d698bb8

update Emulator.fit to expect InputLike, not TensorLike

36a8b64

refactor fit_from_reinitialised function

12371a6

update learners tests

264d6a6

use fit_from_initialized in learners

aa18dca

revert changes in learners

dee04f5

radka-j requested a review from sgreenbury October 10, 2025 07:36

sgreenbury mentioned this pull request Oct 10, 2025

Revisit handling of conversion to tensors and ensuring shape #886

Open

use fit_from_initialized in AL, change types to DistributionLike from GaussianLike to match TranformedEmulator predict type

39797ad

radka-j commented Oct 14, 2025

case_studies/patient_calibration/patient_calibration_case_study.ipynb

radka-j added 2 commits October 14, 2025 11:13

Update case_studies/patient_calibration/patient_calibration_case_study.ipynb

a5fae9b

update docstrings

ae942e5

avoid code repetition

8498cb3

revert to emulator.fit instead of fit_from_reinitialised in stream based AL

85bf20d

radka-j commented Oct 14, 2025

case_studies/patient_calibration/patient_calibration_case_study.ipynb

Update case_studies/patient_calibration/patient_calibration_case_study.ipynb

ce6ea04

sgreenbury mentioned this pull request Oct 14, 2025

Revisit active learning workflow #893

Open

sgreenbury reviewed Oct 14, 2025

autoemulate/calibration/history_matching.py

Update autoemulate/calibration/history_matching.py

b5b902d

Co-authored-by: Sam Greenbury <[email protected]>

sgreenbury reviewed Oct 14, 2025

autoemulate/calibration/history_matching.py

radka-j and others added 2 commits October 14, 2025 16:36

Update autoemulate/calibration/history_matching.py

bcd2939

Co-authored-by: Sam Greenbury <[email protected]>

add option to change whether fit from reinitialized or not in AL

704fe37

sgreenbury reviewed Oct 14, 2025

autoemulate/calibration/history_matching.py

radka-j and others added 2 commits October 14, 2025 16:41

increase learning rate in AL tutorial, set posterior_predictive=False

43039f1

Update autoemulate/calibration/history_matching.py

40d7530

Co-authored-by: Sam Greenbury <[email protected]>

sgreenbury reviewed Oct 14, 2025

autoemulate/learners/base.py

sgreenbury approved these changes Oct 14, 2025

raise error if output distribution does not have a variance property

597977d

radka-j merged commit 3a63ee4 into main Oct 15, 2025
5 checks passed

radka-j deleted the reinitialise branch October 15, 2025 10:11
